{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "03f45ce3-90d2-489e-93b6-36f8d15c7314",
   "metadata": {},
   "source": [
    "简介：该数据集包含了2019年纽约市的Airbnb上线的房间情况。Airbnb是一个旅行房屋租赁社区，用户可通过网站或手机APP发布、搜索度假房屋租赁信息并在线预定。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "eb8cf2cd-fb8e-4dc5-8cf1-9afc8f6711f7",
   "metadata": {},
   "source": [
    "变量含义：\n",
    "- id：房间id\n",
    "- name：房间名称\n",
    "- host_id：房东id\n",
    "- host_name：房东姓名\n",
    "- neighbourhood_group：地区\n",
    "- neighbourhood：街区\n",
    "- latitude：纬度坐标\n",
    "- longitude：经度坐标\n",
    "- room_type：房间类型\n",
    "- price：价格（美元）\n",
    "- minimum_nights：最少预定夜晚数\n",
    "- number_of_reviews：评论数量\n",
    "- last_review：最新浏览\n",
    "- reviews_per_month：每月浏览次数\n",
    "- calculated_host_listings_count：房东挂出房子的数量\n",
    "- availability_365：可预定房源的天数"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "a3c9f7fc-fb55-4d6b-a5d5-75d6d67a8eb4",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "6ac676d4-9649-4793-8f93-b509cd75fefe",
   "metadata": {},
   "outputs": [],
   "source": [
    "original_data = pd.read_csv(\"airbnb_NYC_2019.csv\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "92c645d3-afeb-46be-b3bf-8479e53955b3",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id</th>\n",
       "      <th>name</th>\n",
       "      <th>host_id</th>\n",
       "      <th>host_name</th>\n",
       "      <th>neighbourhood_group</th>\n",
       "      <th>neighbourhood</th>\n",
       "      <th>latitude</th>\n",
       "      <th>longitude</th>\n",
       "      <th>room_type</th>\n",
       "      <th>price</th>\n",
       "      <th>minimum_nights</th>\n",
       "      <th>number_of_reviews</th>\n",
       "      <th>last_review</th>\n",
       "      <th>reviews_per_month</th>\n",
       "      <th>calculated_host_listings_count</th>\n",
       "      <th>availability_365</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2539</td>\n",
       "      <td>Clean &amp; quiet apt home by the park</td>\n",
       "      <td>2787</td>\n",
       "      <td>John</td>\n",
       "      <td>Brooklyn</td>\n",
       "      <td>Kensington</td>\n",
       "      <td>40.64749</td>\n",
       "      <td>-73.97237</td>\n",
       "      <td>Private room</td>\n",
       "      <td>149</td>\n",
       "      <td>1</td>\n",
       "      <td>9</td>\n",
       "      <td>2018-10-19</td>\n",
       "      <td>0.21</td>\n",
       "      <td>6</td>\n",
       "      <td>365</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2595</td>\n",
       "      <td>Skylit Midtown Castle</td>\n",
       "      <td>2845</td>\n",
       "      <td>Jennifer</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>Midtown</td>\n",
       "      <td>40.75362</td>\n",
       "      <td>-73.98377</td>\n",
       "      <td>Entire home/apt</td>\n",
       "      <td>225</td>\n",
       "      <td>1</td>\n",
       "      <td>45</td>\n",
       "      <td>2019-05-21</td>\n",
       "      <td>0.38</td>\n",
       "      <td>2</td>\n",
       "      <td>355</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3647</td>\n",
       "      <td>THE VILLAGE OF HARLEM....NEW YORK !</td>\n",
       "      <td>4632</td>\n",
       "      <td>Elisabeth</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>Harlem</td>\n",
       "      <td>40.80902</td>\n",
       "      <td>-73.94190</td>\n",
       "      <td>Private room</td>\n",
       "      <td>150</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>365</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>3831</td>\n",
       "      <td>Cozy Entire Floor of Brownstone</td>\n",
       "      <td>4869</td>\n",
       "      <td>LisaRoxanne</td>\n",
       "      <td>Brooklyn</td>\n",
       "      <td>Clinton Hill</td>\n",
       "      <td>40.68514</td>\n",
       "      <td>-73.95976</td>\n",
       "      <td>Entire home/apt</td>\n",
       "      <td>89</td>\n",
       "      <td>1</td>\n",
       "      <td>270</td>\n",
       "      <td>2019-07-05</td>\n",
       "      <td>4.64</td>\n",
       "      <td>1</td>\n",
       "      <td>194</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5022</td>\n",
       "      <td>Entire Apt: Spacious Studio/Loft by central park</td>\n",
       "      <td>7192</td>\n",
       "      <td>Laura</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>East Harlem</td>\n",
       "      <td>40.79851</td>\n",
       "      <td>-73.94399</td>\n",
       "      <td>Entire home/apt</td>\n",
       "      <td>80</td>\n",
       "      <td>10</td>\n",
       "      <td>9</td>\n",
       "      <td>2018-11-19</td>\n",
       "      <td>0.10</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>5099</td>\n",
       "      <td>Large Cozy 1 BR Apartment In Midtown East</td>\n",
       "      <td>7322</td>\n",
       "      <td>Chris</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>Murray Hill</td>\n",
       "      <td>40.74767</td>\n",
       "      <td>-73.97500</td>\n",
       "      <td>Entire home/apt</td>\n",
       "      <td>200</td>\n",
       "      <td>3</td>\n",
       "      <td>74</td>\n",
       "      <td>2019-06-22</td>\n",
       "      <td>0.59</td>\n",
       "      <td>1</td>\n",
       "      <td>129</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>5121</td>\n",
       "      <td>BlissArtsSpace!</td>\n",
       "      <td>7356</td>\n",
       "      <td>Garon</td>\n",
       "      <td>Brooklyn</td>\n",
       "      <td>Bedford-Stuyvesant</td>\n",
       "      <td>40.68688</td>\n",
       "      <td>-73.95596</td>\n",
       "      <td>Private room</td>\n",
       "      <td>60</td>\n",
       "      <td>45</td>\n",
       "      <td>49</td>\n",
       "      <td>2017-10-05</td>\n",
       "      <td>0.40</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>5178</td>\n",
       "      <td>Large Furnished Room Near B'way</td>\n",
       "      <td>8967</td>\n",
       "      <td>Shunichi</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>Hell's Kitchen</td>\n",
       "      <td>40.76489</td>\n",
       "      <td>-73.98493</td>\n",
       "      <td>Private room</td>\n",
       "      <td>79</td>\n",
       "      <td>2</td>\n",
       "      <td>430</td>\n",
       "      <td>2019-06-24</td>\n",
       "      <td>3.47</td>\n",
       "      <td>1</td>\n",
       "      <td>220</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>5203</td>\n",
       "      <td>Cozy Clean Guest Room - Family Apt</td>\n",
       "      <td>7490</td>\n",
       "      <td>MaryEllen</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>Upper West Side</td>\n",
       "      <td>40.80178</td>\n",
       "      <td>-73.96723</td>\n",
       "      <td>Private room</td>\n",
       "      <td>79</td>\n",
       "      <td>2</td>\n",
       "      <td>118</td>\n",
       "      <td>2017-07-21</td>\n",
       "      <td>0.99</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>5238</td>\n",
       "      <td>Cute &amp; Cozy Lower East Side 1 bdrm</td>\n",
       "      <td>7549</td>\n",
       "      <td>Ben</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>Chinatown</td>\n",
       "      <td>40.71344</td>\n",
       "      <td>-73.99037</td>\n",
       "      <td>Entire home/apt</td>\n",
       "      <td>150</td>\n",
       "      <td>1</td>\n",
       "      <td>160</td>\n",
       "      <td>2019-06-09</td>\n",
       "      <td>1.33</td>\n",
       "      <td>4</td>\n",
       "      <td>188</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "     id                                              name  host_id  \\\n",
       "0  2539                Clean & quiet apt home by the park     2787   \n",
       "1  2595                             Skylit Midtown Castle     2845   \n",
       "2  3647               THE VILLAGE OF HARLEM....NEW YORK !     4632   \n",
       "3  3831                   Cozy Entire Floor of Brownstone     4869   \n",
       "4  5022  Entire Apt: Spacious Studio/Loft by central park     7192   \n",
       "5  5099         Large Cozy 1 BR Apartment In Midtown East     7322   \n",
       "6  5121                                   BlissArtsSpace!     7356   \n",
       "7  5178                  Large Furnished Room Near B'way      8967   \n",
       "8  5203                Cozy Clean Guest Room - Family Apt     7490   \n",
       "9  5238                Cute & Cozy Lower East Side 1 bdrm     7549   \n",
       "\n",
       "     host_name neighbourhood_group       neighbourhood  latitude  longitude  \\\n",
       "0         John            Brooklyn          Kensington  40.64749  -73.97237   \n",
       "1     Jennifer           Manhattan             Midtown  40.75362  -73.98377   \n",
       "2    Elisabeth           Manhattan              Harlem  40.80902  -73.94190   \n",
       "3  LisaRoxanne            Brooklyn        Clinton Hill  40.68514  -73.95976   \n",
       "4        Laura           Manhattan         East Harlem  40.79851  -73.94399   \n",
       "5        Chris           Manhattan         Murray Hill  40.74767  -73.97500   \n",
       "6        Garon            Brooklyn  Bedford-Stuyvesant  40.68688  -73.95596   \n",
       "7     Shunichi           Manhattan      Hell's Kitchen  40.76489  -73.98493   \n",
       "8    MaryEllen           Manhattan     Upper West Side  40.80178  -73.96723   \n",
       "9          Ben           Manhattan           Chinatown  40.71344  -73.99037   \n",
       "\n",
       "         room_type  price  minimum_nights  number_of_reviews last_review  \\\n",
       "0     Private room    149               1                  9  2018-10-19   \n",
       "1  Entire home/apt    225               1                 45  2019-05-21   \n",
       "2     Private room    150               3                  0         NaN   \n",
       "3  Entire home/apt     89               1                270  2019-07-05   \n",
       "4  Entire home/apt     80              10                  9  2018-11-19   \n",
       "5  Entire home/apt    200               3                 74  2019-06-22   \n",
       "6     Private room     60              45                 49  2017-10-05   \n",
       "7     Private room     79               2                430  2019-06-24   \n",
       "8     Private room     79               2                118  2017-07-21   \n",
       "9  Entire home/apt    150               1                160  2019-06-09   \n",
       "\n",
       "   reviews_per_month  calculated_host_listings_count  availability_365  \n",
       "0               0.21                               6               365  \n",
       "1               0.38                               2               355  \n",
       "2                NaN                               1               365  \n",
       "3               4.64                               1               194  \n",
       "4               0.10                               1                 0  \n",
       "5               0.59                               1               129  \n",
       "6               0.40                               1                 0  \n",
       "7               3.47                               1               220  \n",
       "8               0.99                               1                 0  \n",
       "9               1.33                               4               188  "
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "original_data.head(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "130522ab-2348-435b-89fa-daa61f556d62",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id</th>\n",
       "      <th>name</th>\n",
       "      <th>host_id</th>\n",
       "      <th>host_name</th>\n",
       "      <th>neighbourhood_group</th>\n",
       "      <th>neighbourhood</th>\n",
       "      <th>latitude</th>\n",
       "      <th>longitude</th>\n",
       "      <th>room_type</th>\n",
       "      <th>price</th>\n",
       "      <th>minimum_nights</th>\n",
       "      <th>number_of_reviews</th>\n",
       "      <th>last_review</th>\n",
       "      <th>reviews_per_month</th>\n",
       "      <th>calculated_host_listings_count</th>\n",
       "      <th>availability_365</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>15705</th>\n",
       "      <td>12685482</td>\n",
       "      <td>Manhattan - Near Subway!</td>\n",
       "      <td>68929870</td>\n",
       "      <td>Ben</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>Harlem</td>\n",
       "      <td>40.82617</td>\n",
       "      <td>-73.95173</td>\n",
       "      <td>Private room</td>\n",
       "      <td>69</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36531</th>\n",
       "      <td>29043928</td>\n",
       "      <td>Brooklyn Sanctuary</td>\n",
       "      <td>218883687</td>\n",
       "      <td>Parrish</td>\n",
       "      <td>Brooklyn</td>\n",
       "      <td>East Flatbush</td>\n",
       "      <td>40.63931</td>\n",
       "      <td>-73.94451</td>\n",
       "      <td>Private room</td>\n",
       "      <td>125</td>\n",
       "      <td>1</td>\n",
       "      <td>7</td>\n",
       "      <td>2019-01-01</td>\n",
       "      <td>0.76</td>\n",
       "      <td>2</td>\n",
       "      <td>364</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>813</th>\n",
       "      <td>289665</td>\n",
       "      <td>Decorators 5-Star Flat West Village</td>\n",
       "      <td>1503831</td>\n",
       "      <td>Elle</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>West Village</td>\n",
       "      <td>40.72891</td>\n",
       "      <td>-74.00293</td>\n",
       "      <td>Entire home/apt</td>\n",
       "      <td>450</td>\n",
       "      <td>20</td>\n",
       "      <td>157</td>\n",
       "      <td>2016-08-11</td>\n",
       "      <td>1.71</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9541</th>\n",
       "      <td>7327833</td>\n",
       "      <td>Charming Apartment in Crown Heights</td>\n",
       "      <td>27630423</td>\n",
       "      <td>Julianne</td>\n",
       "      <td>Brooklyn</td>\n",
       "      <td>Crown Heights</td>\n",
       "      <td>40.67464</td>\n",
       "      <td>-73.94276</td>\n",
       "      <td>Entire home/apt</td>\n",
       "      <td>120</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>2015-10-03</td>\n",
       "      <td>0.04</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>47041</th>\n",
       "      <td>35569459</td>\n",
       "      <td>Luxury Full Floor SoHo Loft | 3 Bed/2 Bath</td>\n",
       "      <td>163029687</td>\n",
       "      <td>Anna + Jason</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>SoHo</td>\n",
       "      <td>40.72434</td>\n",
       "      <td>-73.99945</td>\n",
       "      <td>Entire home/apt</td>\n",
       "      <td>900</td>\n",
       "      <td>3</td>\n",
       "      <td>1</td>\n",
       "      <td>2019-07-06</td>\n",
       "      <td>1.00</td>\n",
       "      <td>1</td>\n",
       "      <td>292</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7601</th>\n",
       "      <td>5715049</td>\n",
       "      <td>Park Block Studio</td>\n",
       "      <td>29628369</td>\n",
       "      <td>Nick</td>\n",
       "      <td>Brooklyn</td>\n",
       "      <td>Park Slope</td>\n",
       "      <td>40.66568</td>\n",
       "      <td>-73.97740</td>\n",
       "      <td>Entire home/apt</td>\n",
       "      <td>181</td>\n",
       "      <td>2</td>\n",
       "      <td>238</td>\n",
       "      <td>2019-06-23</td>\n",
       "      <td>4.57</td>\n",
       "      <td>1</td>\n",
       "      <td>302</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>36447</th>\n",
       "      <td>28978474</td>\n",
       "      <td>New + Sunlit Apartment in Bed-Stuy</td>\n",
       "      <td>34394667</td>\n",
       "      <td>Ron</td>\n",
       "      <td>Brooklyn</td>\n",
       "      <td>Bedford-Stuyvesant</td>\n",
       "      <td>40.67782</td>\n",
       "      <td>-73.92326</td>\n",
       "      <td>Entire home/apt</td>\n",
       "      <td>125</td>\n",
       "      <td>3</td>\n",
       "      <td>15</td>\n",
       "      <td>2019-06-24</td>\n",
       "      <td>4.33</td>\n",
       "      <td>1</td>\n",
       "      <td>175</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>32709</th>\n",
       "      <td>25773391</td>\n",
       "      <td>Beautiful Room in Fort Greene</td>\n",
       "      <td>3687035</td>\n",
       "      <td>Abeyu</td>\n",
       "      <td>Brooklyn</td>\n",
       "      <td>Fort Greene</td>\n",
       "      <td>40.68464</td>\n",
       "      <td>-73.97094</td>\n",
       "      <td>Private room</td>\n",
       "      <td>75</td>\n",
       "      <td>3</td>\n",
       "      <td>9</td>\n",
       "      <td>2019-07-03</td>\n",
       "      <td>0.75</td>\n",
       "      <td>1</td>\n",
       "      <td>7</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>10744</th>\n",
       "      <td>8269318</td>\n",
       "      <td>Dreamy Private Bedroom Near Manhattan NYC.</td>\n",
       "      <td>22384027</td>\n",
       "      <td>Shahana</td>\n",
       "      <td>Brooklyn</td>\n",
       "      <td>Crown Heights</td>\n",
       "      <td>40.67112</td>\n",
       "      <td>-73.91685</td>\n",
       "      <td>Private room</td>\n",
       "      <td>39</td>\n",
       "      <td>1</td>\n",
       "      <td>73</td>\n",
       "      <td>2019-05-10</td>\n",
       "      <td>1.57</td>\n",
       "      <td>10</td>\n",
       "      <td>33</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>40332</th>\n",
       "      <td>31283533</td>\n",
       "      <td>New York City Home w/ Full Amenities</td>\n",
       "      <td>39179346</td>\n",
       "      <td>Rooshabh</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>Midtown</td>\n",
       "      <td>40.75414</td>\n",
       "      <td>-73.96595</td>\n",
       "      <td>Entire home/apt</td>\n",
       "      <td>110</td>\n",
       "      <td>1</td>\n",
       "      <td>8</td>\n",
       "      <td>2019-07-01</td>\n",
       "      <td>4.00</td>\n",
       "      <td>1</td>\n",
       "      <td>238</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             id                                        name    host_id  \\\n",
       "15705  12685482                    Manhattan - Near Subway!   68929870   \n",
       "36531  29043928                          Brooklyn Sanctuary  218883687   \n",
       "813      289665         Decorators 5-Star Flat West Village    1503831   \n",
       "9541    7327833         Charming Apartment in Crown Heights   27630423   \n",
       "47041  35569459  Luxury Full Floor SoHo Loft | 3 Bed/2 Bath  163029687   \n",
       "7601    5715049                           Park Block Studio   29628369   \n",
       "36447  28978474          New + Sunlit Apartment in Bed-Stuy   34394667   \n",
       "32709  25773391               Beautiful Room in Fort Greene    3687035   \n",
       "10744   8269318  Dreamy Private Bedroom Near Manhattan NYC.   22384027   \n",
       "40332  31283533        New York City Home w/ Full Amenities   39179346   \n",
       "\n",
       "          host_name neighbourhood_group       neighbourhood  latitude  \\\n",
       "15705           Ben           Manhattan              Harlem  40.82617   \n",
       "36531       Parrish            Brooklyn       East Flatbush  40.63931   \n",
       "813            Elle           Manhattan        West Village  40.72891   \n",
       "9541       Julianne            Brooklyn       Crown Heights  40.67464   \n",
       "47041  Anna + Jason           Manhattan                SoHo  40.72434   \n",
       "7601           Nick            Brooklyn          Park Slope  40.66568   \n",
       "36447           Ron            Brooklyn  Bedford-Stuyvesant  40.67782   \n",
       "32709         Abeyu            Brooklyn         Fort Greene  40.68464   \n",
       "10744       Shahana            Brooklyn       Crown Heights  40.67112   \n",
       "40332      Rooshabh           Manhattan             Midtown  40.75414   \n",
       "\n",
       "       longitude        room_type  price  minimum_nights  number_of_reviews  \\\n",
       "15705  -73.95173     Private room     69               2                  0   \n",
       "36531  -73.94451     Private room    125               1                  7   \n",
       "813    -74.00293  Entire home/apt    450              20                157   \n",
       "9541   -73.94276  Entire home/apt    120               2                  2   \n",
       "47041  -73.99945  Entire home/apt    900               3                  1   \n",
       "7601   -73.97740  Entire home/apt    181               2                238   \n",
       "36447  -73.92326  Entire home/apt    125               3                 15   \n",
       "32709  -73.97094     Private room     75               3                  9   \n",
       "10744  -73.91685     Private room     39               1                 73   \n",
       "40332  -73.96595  Entire home/apt    110               1                  8   \n",
       "\n",
       "      last_review  reviews_per_month  calculated_host_listings_count  \\\n",
       "15705         NaN                NaN                               1   \n",
       "36531  2019-01-01               0.76                               2   \n",
       "813    2016-08-11               1.71                               1   \n",
       "9541   2015-10-03               0.04                               1   \n",
       "47041  2019-07-06               1.00                               1   \n",
       "7601   2019-06-23               4.57                               1   \n",
       "36447  2019-06-24               4.33                               1   \n",
       "32709  2019-07-03               0.75                               1   \n",
       "10744  2019-05-10               1.57                              10   \n",
       "40332  2019-07-01               4.00                               1   \n",
       "\n",
       "       availability_365  \n",
       "15705                 0  \n",
       "36531               364  \n",
       "813                   0  \n",
       "9541                  0  \n",
       "47041               292  \n",
       "7601                302  \n",
       "36447               175  \n",
       "32709                 7  \n",
       "10744                33  \n",
       "40332               238  "
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "original_data.sample(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "58a472bd-7db8-4113-920f-79c58ccf8707",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 48895 entries, 0 to 48894\n",
      "Data columns (total 16 columns):\n",
      " #   Column                          Non-Null Count  Dtype  \n",
      "---  ------                          --------------  -----  \n",
      " 0   id                              48895 non-null  int64  \n",
      " 1   name                            48879 non-null  object \n",
      " 2   host_id                         48895 non-null  int64  \n",
      " 3   host_name                       48874 non-null  object \n",
      " 4   neighbourhood_group             48895 non-null  object \n",
      " 5   neighbourhood                   48895 non-null  object \n",
      " 6   latitude                        48895 non-null  float64\n",
      " 7   longitude                       48895 non-null  float64\n",
      " 8   room_type                       48895 non-null  object \n",
      " 9   price                           48895 non-null  int64  \n",
      " 10  minimum_nights                  48895 non-null  int64  \n",
      " 11  number_of_reviews               48895 non-null  int64  \n",
      " 12  last_review                     38843 non-null  object \n",
      " 13  reviews_per_month               38843 non-null  float64\n",
      " 14  calculated_host_listings_count  48895 non-null  int64  \n",
      " 15  availability_365                48895 non-null  int64  \n",
      "dtypes: float64(3), int64(7), object(6)\n",
      "memory usage: 6.0+ MB\n"
     ]
    }
   ],
   "source": [
    "original_data.info()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "018532a1-32cf-42ef-9d5c-12ddc6382845",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id</th>\n",
       "      <th>name</th>\n",
       "      <th>host_id</th>\n",
       "      <th>host_name</th>\n",
       "      <th>neighbourhood_group</th>\n",
       "      <th>neighbourhood</th>\n",
       "      <th>latitude</th>\n",
       "      <th>longitude</th>\n",
       "      <th>room_type</th>\n",
       "      <th>price</th>\n",
       "      <th>minimum_nights</th>\n",
       "      <th>number_of_reviews</th>\n",
       "      <th>last_review</th>\n",
       "      <th>reviews_per_month</th>\n",
       "      <th>calculated_host_listings_count</th>\n",
       "      <th>availability_365</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>2854</th>\n",
       "      <td>1615764</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6676776</td>\n",
       "      <td>Peter</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>Battery Park City</td>\n",
       "      <td>40.71239</td>\n",
       "      <td>-74.01620</td>\n",
       "      <td>Entire home/apt</td>\n",
       "      <td>400</td>\n",
       "      <td>1000</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>362</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3703</th>\n",
       "      <td>2232600</td>\n",
       "      <td>NaN</td>\n",
       "      <td>11395220</td>\n",
       "      <td>Anna</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>East Village</td>\n",
       "      <td>40.73215</td>\n",
       "      <td>-73.98821</td>\n",
       "      <td>Entire home/apt</td>\n",
       "      <td>200</td>\n",
       "      <td>1</td>\n",
       "      <td>28</td>\n",
       "      <td>2015-06-08</td>\n",
       "      <td>0.45</td>\n",
       "      <td>1</td>\n",
       "      <td>341</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5775</th>\n",
       "      <td>4209595</td>\n",
       "      <td>NaN</td>\n",
       "      <td>20700823</td>\n",
       "      <td>Jesse</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>Greenwich Village</td>\n",
       "      <td>40.73473</td>\n",
       "      <td>-73.99244</td>\n",
       "      <td>Entire home/apt</td>\n",
       "      <td>225</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>2015-01-01</td>\n",
       "      <td>0.02</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5975</th>\n",
       "      <td>4370230</td>\n",
       "      <td>NaN</td>\n",
       "      <td>22686810</td>\n",
       "      <td>Michaël</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>Nolita</td>\n",
       "      <td>40.72046</td>\n",
       "      <td>-73.99550</td>\n",
       "      <td>Entire home/apt</td>\n",
       "      <td>215</td>\n",
       "      <td>7</td>\n",
       "      <td>5</td>\n",
       "      <td>2016-01-02</td>\n",
       "      <td>0.09</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6269</th>\n",
       "      <td>4581788</td>\n",
       "      <td>NaN</td>\n",
       "      <td>21600904</td>\n",
       "      <td>Lucie</td>\n",
       "      <td>Brooklyn</td>\n",
       "      <td>Williamsburg</td>\n",
       "      <td>40.71370</td>\n",
       "      <td>-73.94378</td>\n",
       "      <td>Private room</td>\n",
       "      <td>150</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6567</th>\n",
       "      <td>4756856</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1832442</td>\n",
       "      <td>Carolina</td>\n",
       "      <td>Brooklyn</td>\n",
       "      <td>Bushwick</td>\n",
       "      <td>40.70046</td>\n",
       "      <td>-73.92825</td>\n",
       "      <td>Private room</td>\n",
       "      <td>70</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6605</th>\n",
       "      <td>4774658</td>\n",
       "      <td>NaN</td>\n",
       "      <td>24625694</td>\n",
       "      <td>Josh</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>Washington Heights</td>\n",
       "      <td>40.85198</td>\n",
       "      <td>-73.93108</td>\n",
       "      <td>Private room</td>\n",
       "      <td>40</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8841</th>\n",
       "      <td>6782407</td>\n",
       "      <td>NaN</td>\n",
       "      <td>31147528</td>\n",
       "      <td>Huei-Yin</td>\n",
       "      <td>Brooklyn</td>\n",
       "      <td>Williamsburg</td>\n",
       "      <td>40.71354</td>\n",
       "      <td>-73.93882</td>\n",
       "      <td>Private room</td>\n",
       "      <td>45</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>11963</th>\n",
       "      <td>9325951</td>\n",
       "      <td>NaN</td>\n",
       "      <td>33377685</td>\n",
       "      <td>Jonathan</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>Hell's Kitchen</td>\n",
       "      <td>40.76436</td>\n",
       "      <td>-73.98573</td>\n",
       "      <td>Entire home/apt</td>\n",
       "      <td>190</td>\n",
       "      <td>4</td>\n",
       "      <td>1</td>\n",
       "      <td>2016-01-05</td>\n",
       "      <td>0.02</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12824</th>\n",
       "      <td>9787590</td>\n",
       "      <td>NaN</td>\n",
       "      <td>50448556</td>\n",
       "      <td>Miguel</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>Harlem</td>\n",
       "      <td>40.80316</td>\n",
       "      <td>-73.95189</td>\n",
       "      <td>Entire home/apt</td>\n",
       "      <td>300</td>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>5</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13059</th>\n",
       "      <td>9885866</td>\n",
       "      <td>NaN</td>\n",
       "      <td>37306329</td>\n",
       "      <td>Juliette</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>Chinatown</td>\n",
       "      <td>40.71632</td>\n",
       "      <td>-73.99328</td>\n",
       "      <td>Private room</td>\n",
       "      <td>67</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13401</th>\n",
       "      <td>10052289</td>\n",
       "      <td>NaN</td>\n",
       "      <td>49522403</td>\n",
       "      <td>Vanessa</td>\n",
       "      <td>Brooklyn</td>\n",
       "      <td>Brownsville</td>\n",
       "      <td>40.66409</td>\n",
       "      <td>-73.92314</td>\n",
       "      <td>Private room</td>\n",
       "      <td>50</td>\n",
       "      <td>3</td>\n",
       "      <td>3</td>\n",
       "      <td>2016-08-18</td>\n",
       "      <td>0.07</td>\n",
       "      <td>1</td>\n",
       "      <td>362</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15819</th>\n",
       "      <td>12797684</td>\n",
       "      <td>NaN</td>\n",
       "      <td>69715276</td>\n",
       "      <td>Yan</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>Upper West Side</td>\n",
       "      <td>40.79843</td>\n",
       "      <td>-73.96404</td>\n",
       "      <td>Private room</td>\n",
       "      <td>100</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>16071</th>\n",
       "      <td>12988898</td>\n",
       "      <td>NaN</td>\n",
       "      <td>71552588</td>\n",
       "      <td>Andrea</td>\n",
       "      <td>Bronx</td>\n",
       "      <td>Fordham</td>\n",
       "      <td>40.86032</td>\n",
       "      <td>-73.88493</td>\n",
       "      <td>Shared room</td>\n",
       "      <td>130</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>365</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18047</th>\n",
       "      <td>14135050</td>\n",
       "      <td>NaN</td>\n",
       "      <td>85288337</td>\n",
       "      <td>Jeff</td>\n",
       "      <td>Brooklyn</td>\n",
       "      <td>Bedford-Stuyvesant</td>\n",
       "      <td>40.69421</td>\n",
       "      <td>-73.93234</td>\n",
       "      <td>Private room</td>\n",
       "      <td>70</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>28889</th>\n",
       "      <td>22275821</td>\n",
       "      <td>NaN</td>\n",
       "      <td>49662398</td>\n",
       "      <td>Kathleen</td>\n",
       "      <td>Brooklyn</td>\n",
       "      <td>Bushwick</td>\n",
       "      <td>40.69546</td>\n",
       "      <td>-73.92741</td>\n",
       "      <td>Entire home/apt</td>\n",
       "      <td>110</td>\n",
       "      <td>4</td>\n",
       "      <td>5</td>\n",
       "      <td>2018-08-13</td>\n",
       "      <td>0.27</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             id name   host_id host_name neighbourhood_group  \\\n",
       "2854    1615764  NaN   6676776     Peter           Manhattan   \n",
       "3703    2232600  NaN  11395220      Anna           Manhattan   \n",
       "5775    4209595  NaN  20700823     Jesse           Manhattan   \n",
       "5975    4370230  NaN  22686810   Michaël           Manhattan   \n",
       "6269    4581788  NaN  21600904     Lucie            Brooklyn   \n",
       "6567    4756856  NaN   1832442  Carolina            Brooklyn   \n",
       "6605    4774658  NaN  24625694      Josh           Manhattan   \n",
       "8841    6782407  NaN  31147528  Huei-Yin            Brooklyn   \n",
       "11963   9325951  NaN  33377685  Jonathan           Manhattan   \n",
       "12824   9787590  NaN  50448556    Miguel           Manhattan   \n",
       "13059   9885866  NaN  37306329  Juliette           Manhattan   \n",
       "13401  10052289  NaN  49522403   Vanessa            Brooklyn   \n",
       "15819  12797684  NaN  69715276       Yan           Manhattan   \n",
       "16071  12988898  NaN  71552588    Andrea               Bronx   \n",
       "18047  14135050  NaN  85288337      Jeff            Brooklyn   \n",
       "28889  22275821  NaN  49662398  Kathleen            Brooklyn   \n",
       "\n",
       "            neighbourhood  latitude  longitude        room_type  price  \\\n",
       "2854    Battery Park City  40.71239  -74.01620  Entire home/apt    400   \n",
       "3703         East Village  40.73215  -73.98821  Entire home/apt    200   \n",
       "5775    Greenwich Village  40.73473  -73.99244  Entire home/apt    225   \n",
       "5975               Nolita  40.72046  -73.99550  Entire home/apt    215   \n",
       "6269         Williamsburg  40.71370  -73.94378     Private room    150   \n",
       "6567             Bushwick  40.70046  -73.92825     Private room     70   \n",
       "6605   Washington Heights  40.85198  -73.93108     Private room     40   \n",
       "8841         Williamsburg  40.71354  -73.93882     Private room     45   \n",
       "11963      Hell's Kitchen  40.76436  -73.98573  Entire home/apt    190   \n",
       "12824              Harlem  40.80316  -73.95189  Entire home/apt    300   \n",
       "13059           Chinatown  40.71632  -73.99328     Private room     67   \n",
       "13401         Brownsville  40.66409  -73.92314     Private room     50   \n",
       "15819     Upper West Side  40.79843  -73.96404     Private room    100   \n",
       "16071             Fordham  40.86032  -73.88493      Shared room    130   \n",
       "18047  Bedford-Stuyvesant  40.69421  -73.93234     Private room     70   \n",
       "28889            Bushwick  40.69546  -73.92741  Entire home/apt    110   \n",
       "\n",
       "       minimum_nights  number_of_reviews last_review  reviews_per_month  \\\n",
       "2854             1000                  0         NaN                NaN   \n",
       "3703                1                 28  2015-06-08               0.45   \n",
       "5775                1                  1  2015-01-01               0.02   \n",
       "5975                7                  5  2016-01-02               0.09   \n",
       "6269                1                  0         NaN                NaN   \n",
       "6567                1                  0         NaN                NaN   \n",
       "6605                1                  0         NaN                NaN   \n",
       "8841                1                  0         NaN                NaN   \n",
       "11963               4                  1  2016-01-05               0.02   \n",
       "12824               5                  0         NaN                NaN   \n",
       "13059               4                  0         NaN                NaN   \n",
       "13401               3                  3  2016-08-18               0.07   \n",
       "15819               1                  0         NaN                NaN   \n",
       "16071               1                  0         NaN                NaN   \n",
       "18047               3                  0         NaN                NaN   \n",
       "28889               4                  5  2018-08-13               0.27   \n",
       "\n",
       "       calculated_host_listings_count  availability_365  \n",
       "2854                                1               362  \n",
       "3703                                1               341  \n",
       "5775                                1                 0  \n",
       "5975                                1                 0  \n",
       "6269                                1                 0  \n",
       "6567                                1                 0  \n",
       "6605                                1                 0  \n",
       "8841                                1                 0  \n",
       "11963                               1                 0  \n",
       "12824                               5                 0  \n",
       "13059                               1                 0  \n",
       "13401                               1               362  \n",
       "15819                               2                 0  \n",
       "16071                               1               365  \n",
       "18047                               1                 0  \n",
       "28889                               1                 0  "
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "original_data[original_data[\"name\"].isnull()]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "9043cd4a-0cc0-4185-bdd2-3472d0569f23",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "neighbourhood_group\n",
       "Manhattan        21661\n",
       "Brooklyn         20104\n",
       "Queens            5666\n",
       "Bronx             1091\n",
       "Staten Island      373\n",
       "Name: count, dtype: int64"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "original_data[\"neighbourhood_group\"].value_counts()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "52236856-815d-4a2b-be6d-39e47e0fd698",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id</th>\n",
       "      <th>host_id</th>\n",
       "      <th>latitude</th>\n",
       "      <th>longitude</th>\n",
       "      <th>price</th>\n",
       "      <th>minimum_nights</th>\n",
       "      <th>number_of_reviews</th>\n",
       "      <th>reviews_per_month</th>\n",
       "      <th>calculated_host_listings_count</th>\n",
       "      <th>availability_365</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>count</th>\n",
       "      <td>4.889500e+04</td>\n",
       "      <td>4.889500e+04</td>\n",
       "      <td>48895.000000</td>\n",
       "      <td>48895.000000</td>\n",
       "      <td>48895.000000</td>\n",
       "      <td>48895.000000</td>\n",
       "      <td>48895.000000</td>\n",
       "      <td>38843.000000</td>\n",
       "      <td>48895.000000</td>\n",
       "      <td>48895.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mean</th>\n",
       "      <td>1.901714e+07</td>\n",
       "      <td>6.762001e+07</td>\n",
       "      <td>40.728949</td>\n",
       "      <td>-73.952170</td>\n",
       "      <td>152.720687</td>\n",
       "      <td>7.029962</td>\n",
       "      <td>23.274466</td>\n",
       "      <td>1.373221</td>\n",
       "      <td>7.143982</td>\n",
       "      <td>112.781327</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>std</th>\n",
       "      <td>1.098311e+07</td>\n",
       "      <td>7.861097e+07</td>\n",
       "      <td>0.054530</td>\n",
       "      <td>0.046157</td>\n",
       "      <td>240.154170</td>\n",
       "      <td>20.510550</td>\n",
       "      <td>44.550582</td>\n",
       "      <td>1.680442</td>\n",
       "      <td>32.952519</td>\n",
       "      <td>131.622289</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>min</th>\n",
       "      <td>2.539000e+03</td>\n",
       "      <td>2.438000e+03</td>\n",
       "      <td>40.499790</td>\n",
       "      <td>-74.244420</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.010000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>25%</th>\n",
       "      <td>9.471945e+06</td>\n",
       "      <td>7.822033e+06</td>\n",
       "      <td>40.690100</td>\n",
       "      <td>-73.983070</td>\n",
       "      <td>69.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.190000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>0.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>50%</th>\n",
       "      <td>1.967728e+07</td>\n",
       "      <td>3.079382e+07</td>\n",
       "      <td>40.723070</td>\n",
       "      <td>-73.955680</td>\n",
       "      <td>106.000000</td>\n",
       "      <td>3.000000</td>\n",
       "      <td>5.000000</td>\n",
       "      <td>0.720000</td>\n",
       "      <td>1.000000</td>\n",
       "      <td>45.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>75%</th>\n",
       "      <td>2.915218e+07</td>\n",
       "      <td>1.074344e+08</td>\n",
       "      <td>40.763115</td>\n",
       "      <td>-73.936275</td>\n",
       "      <td>175.000000</td>\n",
       "      <td>5.000000</td>\n",
       "      <td>24.000000</td>\n",
       "      <td>2.020000</td>\n",
       "      <td>2.000000</td>\n",
       "      <td>227.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>max</th>\n",
       "      <td>3.648724e+07</td>\n",
       "      <td>2.743213e+08</td>\n",
       "      <td>40.913060</td>\n",
       "      <td>-73.712990</td>\n",
       "      <td>10000.000000</td>\n",
       "      <td>1250.000000</td>\n",
       "      <td>629.000000</td>\n",
       "      <td>58.500000</td>\n",
       "      <td>327.000000</td>\n",
       "      <td>365.000000</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                 id       host_id      latitude     longitude         price  \\\n",
       "count  4.889500e+04  4.889500e+04  48895.000000  48895.000000  48895.000000   \n",
       "mean   1.901714e+07  6.762001e+07     40.728949    -73.952170    152.720687   \n",
       "std    1.098311e+07  7.861097e+07      0.054530      0.046157    240.154170   \n",
       "min    2.539000e+03  2.438000e+03     40.499790    -74.244420      0.000000   \n",
       "25%    9.471945e+06  7.822033e+06     40.690100    -73.983070     69.000000   \n",
       "50%    1.967728e+07  3.079382e+07     40.723070    -73.955680    106.000000   \n",
       "75%    2.915218e+07  1.074344e+08     40.763115    -73.936275    175.000000   \n",
       "max    3.648724e+07  2.743213e+08     40.913060    -73.712990  10000.000000   \n",
       "\n",
       "       minimum_nights  number_of_reviews  reviews_per_month  \\\n",
       "count    48895.000000       48895.000000       38843.000000   \n",
       "mean         7.029962          23.274466           1.373221   \n",
       "std         20.510550          44.550582           1.680442   \n",
       "min          1.000000           0.000000           0.010000   \n",
       "25%          1.000000           1.000000           0.190000   \n",
       "50%          3.000000           5.000000           0.720000   \n",
       "75%          5.000000          24.000000           2.020000   \n",
       "max       1250.000000         629.000000          58.500000   \n",
       "\n",
       "       calculated_host_listings_count  availability_365  \n",
       "count                    48895.000000      48895.000000  \n",
       "mean                         7.143982        112.781327  \n",
       "std                         32.952519        131.622289  \n",
       "min                          1.000000          0.000000  \n",
       "25%                          1.000000          0.000000  \n",
       "50%                          1.000000         45.000000  \n",
       "75%                          2.000000        227.000000  \n",
       "max                        327.000000        365.000000  "
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "original_data.describe()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "db4186b4-88c8-40da-aee7-c62e442ccca8",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>id</th>\n",
       "      <th>name</th>\n",
       "      <th>host_id</th>\n",
       "      <th>host_name</th>\n",
       "      <th>neighbourhood_group</th>\n",
       "      <th>neighbourhood</th>\n",
       "      <th>latitude</th>\n",
       "      <th>longitude</th>\n",
       "      <th>room_type</th>\n",
       "      <th>price</th>\n",
       "      <th>minimum_nights</th>\n",
       "      <th>number_of_reviews</th>\n",
       "      <th>last_review</th>\n",
       "      <th>reviews_per_month</th>\n",
       "      <th>calculated_host_listings_count</th>\n",
       "      <th>availability_365</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2539</td>\n",
       "      <td>Clean &amp; quiet apt home by the park</td>\n",
       "      <td>2787</td>\n",
       "      <td>John</td>\n",
       "      <td>Brooklyn</td>\n",
       "      <td>Kensington</td>\n",
       "      <td>40.64749</td>\n",
       "      <td>-73.97237</td>\n",
       "      <td>Private room</td>\n",
       "      <td>149</td>\n",
       "      <td>1</td>\n",
       "      <td>9</td>\n",
       "      <td>2018-10-19</td>\n",
       "      <td>0.21</td>\n",
       "      <td>6</td>\n",
       "      <td>365</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2595</td>\n",
       "      <td>Skylit Midtown Castle</td>\n",
       "      <td>2845</td>\n",
       "      <td>Jennifer</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>Midtown</td>\n",
       "      <td>40.75362</td>\n",
       "      <td>-73.98377</td>\n",
       "      <td>Entire home/apt</td>\n",
       "      <td>225</td>\n",
       "      <td>1</td>\n",
       "      <td>45</td>\n",
       "      <td>2019-05-21</td>\n",
       "      <td>0.38</td>\n",
       "      <td>2</td>\n",
       "      <td>355</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3647</td>\n",
       "      <td>THE VILLAGE OF HARLEM....NEW YORK !</td>\n",
       "      <td>4632</td>\n",
       "      <td>Elisabeth</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>Harlem</td>\n",
       "      <td>40.80902</td>\n",
       "      <td>-73.94190</td>\n",
       "      <td>Private room</td>\n",
       "      <td>150</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>365</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>3831</td>\n",
       "      <td>Cozy Entire Floor of Brownstone</td>\n",
       "      <td>4869</td>\n",
       "      <td>LisaRoxanne</td>\n",
       "      <td>Brooklyn</td>\n",
       "      <td>Clinton Hill</td>\n",
       "      <td>40.68514</td>\n",
       "      <td>-73.95976</td>\n",
       "      <td>Entire home/apt</td>\n",
       "      <td>89</td>\n",
       "      <td>1</td>\n",
       "      <td>270</td>\n",
       "      <td>2019-07-05</td>\n",
       "      <td>4.64</td>\n",
       "      <td>1</td>\n",
       "      <td>194</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5022</td>\n",
       "      <td>Entire Apt: Spacious Studio/Loft by central park</td>\n",
       "      <td>7192</td>\n",
       "      <td>Laura</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>East Harlem</td>\n",
       "      <td>40.79851</td>\n",
       "      <td>-73.94399</td>\n",
       "      <td>Entire home/apt</td>\n",
       "      <td>80</td>\n",
       "      <td>10</td>\n",
       "      <td>9</td>\n",
       "      <td>2018-11-19</td>\n",
       "      <td>0.10</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48890</th>\n",
       "      <td>36484665</td>\n",
       "      <td>Charming one bedroom - newly renovated rowhouse</td>\n",
       "      <td>8232441</td>\n",
       "      <td>Sabrina</td>\n",
       "      <td>Brooklyn</td>\n",
       "      <td>Bedford-Stuyvesant</td>\n",
       "      <td>40.67853</td>\n",
       "      <td>-73.94995</td>\n",
       "      <td>Private room</td>\n",
       "      <td>70</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2</td>\n",
       "      <td>9</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48891</th>\n",
       "      <td>36485057</td>\n",
       "      <td>Affordable room in Bushwick/East Williamsburg</td>\n",
       "      <td>6570630</td>\n",
       "      <td>Marisol</td>\n",
       "      <td>Brooklyn</td>\n",
       "      <td>Bushwick</td>\n",
       "      <td>40.70184</td>\n",
       "      <td>-73.93317</td>\n",
       "      <td>Private room</td>\n",
       "      <td>40</td>\n",
       "      <td>4</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>2</td>\n",
       "      <td>36</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48892</th>\n",
       "      <td>36485431</td>\n",
       "      <td>Sunny Studio at Historical Neighborhood</td>\n",
       "      <td>23492952</td>\n",
       "      <td>Ilgar &amp; Aysel</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>Harlem</td>\n",
       "      <td>40.81475</td>\n",
       "      <td>-73.94867</td>\n",
       "      <td>Entire home/apt</td>\n",
       "      <td>115</td>\n",
       "      <td>10</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>27</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48893</th>\n",
       "      <td>36485609</td>\n",
       "      <td>43rd St. Time Square-cozy single bed</td>\n",
       "      <td>30985759</td>\n",
       "      <td>Taz</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>Hell's Kitchen</td>\n",
       "      <td>40.75751</td>\n",
       "      <td>-73.99112</td>\n",
       "      <td>Shared room</td>\n",
       "      <td>55</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>6</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>48894</th>\n",
       "      <td>36487245</td>\n",
       "      <td>Trendy duplex in the very heart of Hell's Kitchen</td>\n",
       "      <td>68119814</td>\n",
       "      <td>Christophe</td>\n",
       "      <td>Manhattan</td>\n",
       "      <td>Hell's Kitchen</td>\n",
       "      <td>40.76404</td>\n",
       "      <td>-73.98933</td>\n",
       "      <td>Private room</td>\n",
       "      <td>90</td>\n",
       "      <td>7</td>\n",
       "      <td>0</td>\n",
       "      <td>NaN</td>\n",
       "      <td>NaN</td>\n",
       "      <td>1</td>\n",
       "      <td>23</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>48895 rows × 16 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "             id                                               name   host_id  \\\n",
       "0          2539                 Clean & quiet apt home by the park      2787   \n",
       "1          2595                              Skylit Midtown Castle      2845   \n",
       "2          3647                THE VILLAGE OF HARLEM....NEW YORK !      4632   \n",
       "3          3831                    Cozy Entire Floor of Brownstone      4869   \n",
       "4          5022   Entire Apt: Spacious Studio/Loft by central park      7192   \n",
       "...         ...                                                ...       ...   \n",
       "48890  36484665    Charming one bedroom - newly renovated rowhouse   8232441   \n",
       "48891  36485057      Affordable room in Bushwick/East Williamsburg   6570630   \n",
       "48892  36485431            Sunny Studio at Historical Neighborhood  23492952   \n",
       "48893  36485609               43rd St. Time Square-cozy single bed  30985759   \n",
       "48894  36487245  Trendy duplex in the very heart of Hell's Kitchen  68119814   \n",
       "\n",
       "           host_name neighbourhood_group       neighbourhood  latitude  \\\n",
       "0               John            Brooklyn          Kensington  40.64749   \n",
       "1           Jennifer           Manhattan             Midtown  40.75362   \n",
       "2          Elisabeth           Manhattan              Harlem  40.80902   \n",
       "3        LisaRoxanne            Brooklyn        Clinton Hill  40.68514   \n",
       "4              Laura           Manhattan         East Harlem  40.79851   \n",
       "...              ...                 ...                 ...       ...   \n",
       "48890        Sabrina            Brooklyn  Bedford-Stuyvesant  40.67853   \n",
       "48891        Marisol            Brooklyn            Bushwick  40.70184   \n",
       "48892  Ilgar & Aysel           Manhattan              Harlem  40.81475   \n",
       "48893            Taz           Manhattan      Hell's Kitchen  40.75751   \n",
       "48894     Christophe           Manhattan      Hell's Kitchen  40.76404   \n",
       "\n",
       "       longitude        room_type  price  minimum_nights  number_of_reviews  \\\n",
       "0      -73.97237     Private room    149               1                  9   \n",
       "1      -73.98377  Entire home/apt    225               1                 45   \n",
       "2      -73.94190     Private room    150               3                  0   \n",
       "3      -73.95976  Entire home/apt     89               1                270   \n",
       "4      -73.94399  Entire home/apt     80              10                  9   \n",
       "...          ...              ...    ...             ...                ...   \n",
       "48890  -73.94995     Private room     70               2                  0   \n",
       "48891  -73.93317     Private room     40               4                  0   \n",
       "48892  -73.94867  Entire home/apt    115              10                  0   \n",
       "48893  -73.99112      Shared room     55               1                  0   \n",
       "48894  -73.98933     Private room     90               7                  0   \n",
       "\n",
       "      last_review  reviews_per_month  calculated_host_listings_count  \\\n",
       "0      2018-10-19               0.21                               6   \n",
       "1      2019-05-21               0.38                               2   \n",
       "2             NaN                NaN                               1   \n",
       "3      2019-07-05               4.64                               1   \n",
       "4      2018-11-19               0.10                               1   \n",
       "...           ...                ...                             ...   \n",
       "48890         NaN                NaN                               2   \n",
       "48891         NaN                NaN                               2   \n",
       "48892         NaN                NaN                               1   \n",
       "48893         NaN                NaN                               6   \n",
       "48894         NaN                NaN                               1   \n",
       "\n",
       "       availability_365  \n",
       "0                   365  \n",
       "1                   355  \n",
       "2                   365  \n",
       "3                   194  \n",
       "4                     0  \n",
       "...                 ...  \n",
       "48890                 9  \n",
       "48891                36  \n",
       "48892                27  \n",
       "48893                 2  \n",
       "48894                23  \n",
       "\n",
       "[48895 rows x 16 columns]"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "original_data[original_data[\"longitude\"] < 0]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "220be921-65d5-468b-88e7-93643d290c82",
   "metadata": {},
   "source": [
    "卡壳啦！！！！呜呜呜呜"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "78881128-ea51-4657-b9ea-51d8d5f27d6c",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
